{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "http://cs229.stanford.edu/ps/ps1/ps1.pdf"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# (a)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given Newton's method for finding a root of $f$,\n",
    "\n",
    "\\begin{align*}\n",
    "x^{(i + 1)} &= x^{(i)} - \\frac{f(x^{(i)})}{f'(x^{(i)})} \\\\\n",
    "\\end{align*}\n",
    "\n",
    "and the linear reparameterization\n",
    "\n",
    "\\begin{align*}\n",
    "g(z) &= f(Az) \\\\\n",
    "z^{(i)} &= A^{-1}x^{(i)} \\\\\n",
    "\\end{align*}\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\\begin{align*}\n",
    "z^{(i + 1)} &= z^{(i)} - \\frac{g(z^{(i)})}{g'(z^{(i)})} \\\\\n",
    "            &= z^{(i)} - \\frac{f(Az^{(i)})}{f'(Az^{(i)}) \\frac{\\partial (Az^{(i)})}{\\partial z^{(i)}}} \\\\\n",
    "            &= z^{(i)} - \\frac{f(Az^{(i)})}{A f'(Az^{(i)})} \\\\\n",
    "            &= z^{(i)} - A^{-1}\\frac{f(Az^{(i)})}{f'(Az^{(i)})} \\\\\n",
    "            &= A^{-1}x^{(i)} - A^{-1} \\frac{f(x^{(i)})}{f'(x^{(i)})} \\\\\n",
    "            &= A^{-1}x^{(i)} - A^{-1} (x^{(i)} - x^{(i+1)}) \\\\\n",
    "            &= A^{-1} x^{(i+1)}\n",
    "\\end{align*}\n",
    "\n",
    "So if $z^{(i)} = A^{-1}x^{(i)}$, then $z^{(i+1)} = A^{-1}x^{(i+1)}$; by induction the two runs stay in exact correspondence, i.e. Newton's method is invariant to linear reparameterization. (The derivation is written for the one-dimensional case; in higher dimensions $f'$ becomes the gradient, the division becomes multiplication by the inverse Hessian, and the same cancellation goes through.)"
   ]
  },
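  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick numerical sanity check (not part of the original problem set), we can run Newton's method on a concrete one-dimensional example and confirm that the reparameterized iterates satisfy $z^{(i)} = A^{-1}x^{(i)}$ at every step. The function $f(x) = x^3 - 2$, the scalar $A = 3$, and the starting point are illustrative choices."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Numerical check: Newton iterates on f and on g(z) = f(Az) stay in\n",
    "# correspondence z = x / a. The choices of f, a, and x^(0) are illustrative.\n",
    "def newton_step(h, hp, v):\n",
    "    \"\"\"One Newton update v - h(v) / h'(v).\"\"\"\n",
    "    return v - h(v) / hp(v)\n",
    "\n",
    "f = lambda x: x ** 3 - 2.0      # root at 2 ** (1 / 3)\n",
    "fp = lambda x: 3.0 * x ** 2\n",
    "a = 3.0                         # scalar \"A\" (invertible)\n",
    "g = lambda z: f(a * z)          # reparameterized function\n",
    "gp = lambda z: a * fp(a * z)    # chain rule\n",
    "\n",
    "x, z = 1.5, 1.5 / a             # z^(0) = A^{-1} x^(0)\n",
    "for i in range(6):\n",
    "    x = newton_step(f, fp, x)\n",
    "    z = newton_step(g, gp, z)\n",
    "    assert abs(z - x / a) < 1e-12, (i, z, x / a)\n",
    "print('iterates agree: z = x / a at every step; x ->', x)"
   ]
  },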
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# (b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given gradient descent with step size $\\alpha$,\n",
    "\n",
    "\\begin{align*}\n",
    "x^{(i + 1)} &= x^{(i)} - \\alpha f'(x^{(i)})\n",
    "\\end{align*}\n",
    "\n",
    "and the same reparameterization\n",
    "\n",
    "\\begin{align*}\n",
    "g(z) &= f(Az) \\\\\n",
    "z^{(i)} &= A^{-1}x^{(i)} \\\\\n",
    "\\end{align*}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\\begin{align*}\n",
    "z^{(i + 1)} &= z^{(i)} - \\alpha g'(z^{(i)}) \\\\\n",
    "            &= z^{(i)} - \\alpha A f'(Az^{(i)}) \\\\\n",
    "            &= A^{-1}x^{(i)} - \\alpha A f'(x^{(i)}) \\\\\n",
    "            &= A^{-1}x^{(i)} - \\alpha A \\frac{x^{(i)} - x^{(i+1)}}{\\alpha} \\\\\n",
    "            &= A^{-1}x^{(i)} - A x^{(i)} + A x^{(i+1)}\n",
    "\\end{align*}"
   ]
  },
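  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The same kind of numerical check (again an illustrative example, not part of the problem set) shows that gradient descent iterates do *not* stay in correspondence, while the closed form $z^{(i+1)} = A^{-1}x^{(i)} - Ax^{(i)} + Ax^{(i+1)}$ derived above does hold. Here $f(x) = (x - 2)^2$, $A = 3$, and $\\alpha = 0.1$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Numerical check: gradient descent iterates on g(z) = f(az) do NOT\n",
    "# track z = x / a, but they do match the closed form derived above.\n",
    "# The choices of f, a, alpha, and x0 are illustrative.\n",
    "fp = lambda x: 2.0 * (x - 2.0)      # derivative of f(x) = (x - 2)^2\n",
    "a, alpha = 3.0, 0.1                 # scalar \"A\" and step size\n",
    "gp = lambda z: a * fp(a * z)        # chain rule for g(z) = f(a z)\n",
    "\n",
    "x0 = 0.5\n",
    "z0 = x0 / a                         # z^(0) = A^{-1} x^(0)\n",
    "x1 = x0 - alpha * fp(x0)            # gradient step in x\n",
    "z1 = z0 - alpha * gp(z0)            # gradient step in z\n",
    "\n",
    "print('z1     =', z1)\n",
    "print('x1 / a =', x1 / a)\n",
    "assert abs(z1 - x1 / a) > 1e-6                        # correspondence broken\n",
    "assert abs(z1 - (x0 / a - a * x0 + a * x1)) < 1e-12   # matches derivation"
   ]
  },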
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since in general $A^{-1}x^{(i)} - Ax^{(i)} + Ax^{(i+1)} \\neq A^{-1}x^{(i+1)}$, gradient descent is not invariant to linear reparameterization.\n",
    "\n",
    "The key difference in Newton's method is that $g'(z)$ appears in the denominator, so the chain-rule factor enters as $A^{-1}$ and the terms combine to give $A^{-1}x^{(i+1)}$. In gradient descent the derivative is multiplied in, the factor enters as $A$, and no such cancellation occurs."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
